Abstract 

Quantum universality can be achieved using classically controlled stabilizer operations and 
repeated preparation of certain ancilla states. Which ancilla states suffice for universality? 
This "magic states distillation" question is closely related to quantum fault tolerance. Lower 
bounds on the noise tolerable on the ancilla help give lower bounds on the tolerable noise rate 
threshold for fault-tolerant computation. Upper bounds show the limits of threshold upper- 
bound arguments based on the Gottesman-Knill theorem. 

We extend the range of single-qubit mixed states that are known to give universality, by 
using a simple parity-checking operation. For applications to proving threshold lower bounds, 
certain practical stability characteristics are often required, and we also show a stable distillation 
procedure. 

No distillation upper bounds are known beyond those given by the Gottesman-Knill theorem. 
One might ask whether distillation upper bounds reduce to upper bounds for single-qubit ancilla 
states. For multi-qubit pure states and previously considered two-qubit ancilla states, the answer 
is yes. However, we exhibit two-qubit mixed states that are not mixtures of stabilizer states, but 
for which every postselected stabilizer reduction from two qubits to one outputs a mixture of 
stabilizer states. Distilling such states would require true multi-qubit state distillation methods. 

1 Introduction 

Stabilizer operations, consisting of Clifford group unitaries and Pauli operator measurement and 
eigenstate preparation, suffice for generating interesting, highly entangled quantum states. By the 
Gottesman-Knill theorem, however, they are efficiently classically simulatable and not quantum 
universal [AG04]. What more does one need to obtain quantum universality? A sufficient additional 
operation is any one-qubit unitary that is not a Clifford up to overall phase [Shi03]. Almost 
every n-qubit unitary also suffices. Much less is known, however, about which non-unitary quantum 
channels, such as noisy gates, suffice for universality together with stabilizer operations. If we as- 
sume that the quantum operations are under the adaptive control of a universal classical computer, 
then this question turns out to be a special case of a broader problem: 

Magic states distillation problem. For which quantum states ρ do stabilizer operations plus 
repeated preparation of ρ imply quantum universality? 

If repeated preparation of ρ and stabilizer operations give universality, we say for short U(ρ). 
The problem of characterizing the states ρ for which U(ρ) holds has its main application in quantum 
fault tolerance. For fault-tolerance schemes based on stabilizer codes, encoded stabilizer operations 
are the easiest operations to implement. Magic states distillation allows extending these operations 
to a full universal set, provided that noisy enough states ρ satisfy U(ρ). Conversely, limits on when 
U(ρ) can hold can also give limits on fault-tolerance schemes. 

In this paper, we extend the range of single-qubit mixed states ρ that are known to give 
universality, by using a simple parity-checking operation. However, the question of fully 
characterizing those single-qubit states ρ for which U(ρ) holds remains open. For multi-qubit 
pure states, the question U(ρ)? reduces to the same question for single-qubit pure states. 
Unfortunately, though, as a second result we show that this question for mixed multi-qubit 
states ρ generally does not reduce to the single-qubit problem. We also study the applications 
of magic states distillation to quantum fault-tolerance schemes. In particular, these 
applications in practice will require certain stability properties, and we present a stable 
distillation procedure. 

1.1 Single-qubit magic states distillation 

Bravyi and Kitaev [BK05] posed and studied the magic states distillation problem for single-qubit 
states ρ. Single-qubit states are conveniently parametrized by the Bloch sphere of Figure 1, under 
the correspondence (x, y, z) ↔ ρ(x, y, z) = ½(I + xX + yY + zZ). Here 
$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, 
$Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ are the Pauli matrices. Bravyi and Kitaev proved that certain ρ can be efficiently 
distilled with stabilizer operations to be arbitrarily close to the state |H⟩ := cos(π/8)|0⟩ + sin(π/8)|1⟩ ↔ 
(1/√2, 0, 1/√2). Knill, Laflamme and Zurek [KLZ98] proved U(|H⟩). Combining these results gives: 

Theorem 1 ([BK05]). U(ρ) holds for single-qubit states ρ(x, y, z) with 

\[ \max\{|x| + |z|,\ |x| + |y|,\ |y| + |z|\} > 1.015 \tag{1} \]
or
\[ |x| + |y| + |z| > 3/\sqrt{7} \approx 1.134 . \tag{2} \]

On the other hand, it is obvious that U(ρ) does not hold for ρ(x, y, z) with |x| + |y| + |z| ≤ 1, as 
such states are convex combinations of the Pauli eigenstates {(±1, 0, 0), (0, ±1, 0), (0, 0, ±1)}. Such 
states we call "stabilizer states," since they can be prepared using stabilizer operations. 
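
The bounds quoted so far can be collected into a small classifier. The following Python sketch is only an illustration: the function name `known_status` and its return strings are hypothetical, and it encodes nothing beyond the stabilizer octahedron and the two conditions of Theorem 1.

```python
import math

def known_status(x, y, z):
    """Classify the Bloch vector (x, y, z) using only Theorem 1 and the octahedron."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax * ax + ay * ay + az * az > 1 + 1e-12:
        raise ValueError("not a valid single-qubit state")
    if ax + ay + az <= 1:
        return "stabilizer mixture"            # inside the octahedron
    pair_max = max(ax + az, ax + ay, ay + az)  # left-hand side of Eq. (1)
    if pair_max > 1.015 or ax + ay + az > 3 / math.sqrt(7):
        return "distillable (Theorem 1)"
    return "open (gap region)"
```

For example, the point x = y = z = 0.36 has |x| + |y| + |z| = 1.08, which lies between 1 and 3/√7, so the sketch reports it as belonging to the gap that the later theorems narrow.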

Thus Theorem 1 leaves a gap in between the stabilizer states and the known distillable states. 
What happens in the region between them? Non-adaptive stabilizer operations alone compute 
the class ⊕L, so are probably not universal even for classical computation [AG04]. One intriguing 
possibility is that stabilizer operations with states ρ(x, y, z) outside the region of Theorem 1 but 
with |x| + |y| + |z| > 1 could give an intermediate class between BPP and BQP. Another possibility 
is that there is a sharp threshold, i.e., that U(ρ(x, y, z)) holds exactly when |x| + |y| + |z| > 1. 

In fact, Ref. [Rei05] showed that there is indeed a sharp threshold in the xz-plane of the Bloch 
sphere: 

Theorem 2 ([Rei05]). U(ρ(x, y, z)) holds if 

\[ \max\{|x| + |z|,\ |x| + |y|,\ |y| + |z|\} > 1 . \tag{3} \]

The improvement of Eq. (3) over Eq. (1) is tight, as the states ρ(x, 0, 1 − x) are stabilizer states. 
Theorem 2 also implies: 

Corollary 1 ([Rei05]). U(ρ) holds for every single-qubit pure state ρ that is not one of the six 
Pauli eigenstates. 

Pure states correspond to points on the surface of the Bloch sphere, i.e., x² + y² + z² = 1. 

In this paper, we extend the set of single-qubit states ρ for which we know U(ρ) slightly further: 







Figure 1: Bloch sphere: Up to a phase, single-qubit states are in correspondence with points on or in 
the unit sphere in R³. The state ρ(x, y, z) = ½(I + xX + yY + zZ) corresponds to the point (x, y, z). 
Pure states correspond to points on the surface of the sphere. All points ρ in the octahedron O, the 
convex hull of the six Pauli eigenstates, can be prepared using stabilizer operations. 



Theorem 3. U(ρ(x, y, z)) holds if 

\[ \max\bigl\{|x| + \sqrt{y^2 + z^2},\ |y| + \sqrt{x^2 + z^2},\ |z| + \sqrt{x^2 + y^2}\bigr\} > 1 . \tag{4} \]

The basic operation required is a simple parity check, which we introduce in Section 2. Applying 
the parity check in the computational and dual bases, in Section 3, gives the stated improvement 
in the distillable region of states. 

Moreover, we show in Section 4 that the set of distillable states is strictly larger than the set 
delimited by Eqs. (2) and (4): 

Theorem 4. U(ρ(fx, fy, fz)) holds for x = y = (3/√7 − 1)/(2 − √2) ≈ 0.229, z = 1 − √2 x ≈ 0.677 and 
f = 0.9895. 

Notice that the values of (x, y, z) in Theorem 4 satisfy Eqs. (2) and (4) with equality, so 
(fx,fy,fz) is a slight but strict improvement. The proof of Theorem 4 comes from a small 
modification of Bravyi and Kitaev's distillation scheme. 
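
As a quick numerical check of the remark above, the sketch below assumes that the point of Theorem 4 is the one in the cross-section x = y that meets Eqs. (2) and (4) with equality, which gives x = (3/√7 − 1)/(2 − √2) and z = 1 − √2 x. These closed forms are an inference from that statement and from the quoted decimals, not expressions taken from the paper; the helper names are hypothetical.

```python
import math

# Assumed closed forms: solve 2x + z = 3/sqrt(7) together with z + sqrt(2) x = 1.
X4 = (3 / math.sqrt(7) - 1) / (2 - math.sqrt(2))
Z4 = 1 - math.sqrt(2) * X4
F4 = 0.9895

def eq2_lhs(x, y, z):
    """Left-hand side of Eq. (2)."""
    return abs(x) + abs(y) + abs(z)

def eq4_lhs(x, y, z):
    """Left-hand side of Eq. (4)."""
    return max(abs(x) + math.hypot(y, z),
               abs(y) + math.hypot(x, z),
               abs(z) + math.hypot(x, y))

scaled = (F4 * X4, F4 * X4, F4 * Z4)
```

Under these assumptions the unscaled point sits exactly on both boundaries, while the scaled point of Theorem 4 lies strictly inside both bounds and still outside the stabilizer octahedron, which is why the theorem is a strict improvement.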

Figure 2 displays the new distillable state results. 

1.2 Multi-qubit state distillation 

In [Rei05], this author considered the question U(ρ)? for multi-qubit pure states. For every 
multi-qubit pure state that is not a stabilizer state, there exists a sequence of stabilizer operations that 
reduces the state down to a single-qubit pure state that is not a Pauli eigenstate. From Corollary 1, 
this implies: 
Figure 2: Open region: In (a) is shown the region of single-qubit states for which U(ρ) had not been 
known, bounded in one octant of the Bloch sphere by 1 < x + y + z < 3/√7 and 
max{x + y, y + z, x + z} < 1. The other octants are symmetrical. The region remaining after Theorem 3, which 
adds the inequality max{x + √(y² + z²), y + √(x² + z²), z + √(x² + y²)} < 1, is shown in (b). Part (c) 
shows a cross-section of this octant through the plane x = y. The narrowings of the open region due 
to Theorem 3 are indicated with thick blue curves. The thick red curve shows the improvement 
from the distillation procedure of Theorem 4; the point ρ = 0.9895 (x, x, z), with x ≈ 0.229 and 
z ≈ 0.677 as in Theorem 4, is indicated. This curve has been computed numerically, and we do not 
know a parametric form. 

Corollary 2 ([Rei05]). U(ρ) holds for every single- or multi-qubit pure state that is not a stabilizer 
state. 

Corollary 3. Almost every n-qubit unitary, together with stabilizer operations, gives quantum 
universality. 

Corollary 3 follows from Corollary 2 by applying the unitary to |0^n⟩; the result will almost certainly 
not be a stabilizer state, since there are a finite number of pure n-qubit stabilizer states. 

We here study distillation of two-qubit mixed states. We provide an example of a two-qubit 
state that is not a mixture of stabilizer states, but for which every two-to-one-qubit stabilizer 
reduction outputs a mixture of stabilizer states (Section 5). This implies that, unlike the situation 
for pure states, the question of U(ρ)? for mixed states does not reduce to the single-qubit case. 

Theorem 5. The state ρ = ¼ II + (1/12)(IY + IZ − XX + YX + ZX) is not a mixture of stabilizer 
states, but every two-to-one-qubit stabilizer reduction outputs a mixture of single-qubit stabilizer 
states. 

Theorem 5 is proved by studying the 15-dimensional polytope of two-qubit stabilizer states. 
However, the problem of multi-qubit magic states distillation is still mostly open; for example, we do 
not know if U(ρ) holds for the ρ of Theorem 5, nor do we know of any two-qubit distillation procedure 
that cannot start with reduction to a one-qubit state. Dennis has previously considered distillation 
of a two-qubit state for universality [Den01]. However, for the state (1/√3)(|00⟩ + |01⟩ + |10⟩) and noise 
model in [Den01], as well as for simultaneous depolarizing noise, universality does reduce to the 
single-qubit case. Below the noise level at which the state becomes a mixture of stabilizer states, 
universality can be obtained by measuring the second qubit to be |+⟩ and applying Theorem 2. 

1.3 State distillation and quantum fault tolerance 

This magic states distillation problem is natural from a quantum information point of view, with the 
motivation coming from understanding the gap between classically controlled stabilizer operations 
(BPP) and full quantum universality (BQP). The main application, though, is to lower and upper 
bounds for quantum fault tolerance (Section 7). 

Magic states distillation serves as a reduction from universal fault-tolerant computation down 
to fault-tolerant stabilizer operations. This reduction often works in practice without affecting the 
maximum tolerable noise rate, or threshold, because the bottleneck is in achieving reliable stabilizer 
operations. 

Determining upper bounds on the tolerable noise threshold is a difficult problem. We show that 
two recent upper bounds [VHP05, BCL+06], based on upper-bounding the amount of noise before a 
gate set becomes classically simulatable by the Gottesman-Knill theorem, are unfortunately tight. 
For example, 

Theorem 6. Classically controlled stabilizer operations, together with repeated application of a π/8 
gate (exp(i(π/8)Z)) subject to worst-case probabilistic noise at rate p (or dephased at twice that rate), 
give universality if and only if p < ½(1 − 1/√2). 

Classically controlled stabilizer operations, together with repeated application of a π/8 gate 
depolarized at rate p, give universality if and only if p < (6 − 2√2)/7. 

The two "only if" parts of Theorem 6 are proved by [VHP05] and [BCL+06], respectively. The 
first "if" part is a consequence of Theorem 2. We prove the second "if" part by distilling a certain 
two-qubit state, based on the Jamiolkowski isomorphism. 
2 Universality of stabilizer operations and preparation of |H⟩ 

This section reviews the proof from Refs. [KLZ98, BK05] of U(|H⟩) and introduces the 
parity-checking operation used in Section 3 to implement an improved magic states distillation 
procedure. This scheme is similar to the partner-pairing algorithm used in heat-bath algorithmic 
cooling [FLMR04]. 

Definition 1. Stabilizer operations consist of Clifford group unitaries, preparation of |0⟩ and 
measurement in the computational |0⟩, |1⟩ basis. Clifford group unitaries are generated by the 
Hadamard gate $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, the phase gate 
$Z^{1/2} = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$ and the controlled-NOT gate, CNOT 
|a⟩|b⟩ = |a⟩|a + b mod 2⟩ for a, b ∈ {0, 1}. 

The CNOT gate and arbitrary single-qubit rotations from SU(2) give universality [BBC+95], 
from which Boykin et al. proved the universality of CNOT, Hadamard, and Z^{1/4} [BMP+00]. The 
intuition is that using Z^{1/4} and its stabilizer conjugates, like X^{1/4}, it is easy to obtain a rotation 
by an irrational multiple of π about some axis, and hence a dense set of rotations about that axis. 
Rotations about a second axis can be obtained by conjugation with a stabilizer operation. 

The proof of U(|H⟩) follows from a trick to implement exp(i(π/8)Z) on a data qubit using repeated 
preparation of exp(i(π/8)Z)|+⟩ and stabilizer operations with adaptive classical control. Here |+⟩ = 
(1/√2)(|0⟩ + |1⟩) is the +1 eigenstate of X. 

Single-qubit pure states can be parameterized by polar coordinates (θ, φ) on the Bloch sphere: 
|ψ(θ, φ)⟩ = cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩, so |H⟩ = |ψ(π/4, 0)⟩. Single-qubit Clifford unitaries consist 
exactly of the rotational symmetries of the octahedron O of Figure 1, so in particular |ψ(θ, 0)⟩ and 
|ψ(π/2, θ)⟩ are symmetrical. Now notice that 

\[ (\alpha|0\rangle + \beta|1\rangle) \otimes (|0\rangle + e^{i\theta}|1\rangle) = \alpha|00\rangle + \beta e^{i\theta}|11\rangle + \alpha e^{i\theta}|01\rangle + \beta|10\rangle \tag{5} \]

by simply expanding out the tensor product. Therefore, we can apply a CNOT between the qubits 
to measure the parity. On measuring even parity, the operation diag(1, e^{iθ}) has been applied to the 
remaining qubit. When the parity is odd, diag(1, e^{−iθ}) has been applied. In the latter case, we repeat 
the process, carrying out a random walk on phases that are integer multiples of θ. Terminate the 
walk when the phase is +θ. Note the necessity of adaptive classical control of the quantum circuit. 

This procedure in particular implements Z^{1/4} ∝ e^{i(π/8)Z} given repeated preparation of |H⟩, giving 
quantum universality U(|H⟩). In this case, a random walk is actually not necessary; on measuring 
odd parity, apply the correction Y^{1/2}. In fact, Shi has extended [BBC+95] to show universality of 
{CNOT, T}, where T is any single-qubit real gate such that T² does not preserve the computational 
basis [Shi03]. This implies U(|ψ(θ, 0)⟩) for almost all θ. 
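
The parity-measurement trick of Eq. (5) can be traced with a few lines of state-vector arithmetic. The sketch below is a plain-Python illustration, assuming the data qubit is the CNOT control and the ancilla the target; the function name `inject_phase` and its return format are hypothetical.

```python
import cmath
import math

def inject_phase(alpha, beta, theta):
    """Parity check between a data qubit alpha|0> + beta|1> and (|0> + e^{i theta}|1>)/sqrt(2).

    Returns {parity: (probability, (a0, a1))}, where (a0, a1) is the normalized
    data-qubit state left behind when the ancilla is measured as `parity`.
    """
    phase = cmath.exp(1j * theta)
    s = 1 / math.sqrt(2)
    # Amplitudes of |data, ancilla> before the CNOT, as in Eq. (5).
    amp = {(0, 0): alpha * s, (0, 1): alpha * phase * s,
           (1, 0): beta * s, (1, 1): beta * phase * s}
    # CNOT with the data qubit as control: ancilla becomes data XOR ancilla.
    after = {(d, d ^ a): v for (d, a), v in amp.items()}
    branches = {}
    for parity in (0, 1):
        a0, a1 = after[(0, parity)], after[(1, parity)]
        prob = abs(a0) ** 2 + abs(a1) ** 2
        branches[parity] = (prob, (a0 / math.sqrt(prob), a1 / math.sqrt(prob)))
    return branches
```

In this sketch the even-parity branch carries the data state with the extra phase e^{iθ} on |1⟩ and the odd-parity branch carries e^{−iθ}, up to a global phase, each with probability 1/2 for a normalized input.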

3 Parity-checking distillation algorithm 

The same parity-checking operation can be applied to single-qubit mixed states. 
3.1 Case y = 0 

Consider first y = 0. Two copies of ρ(x, 0, z) have the joint density matrix, in the basis 
|00⟩, |01⟩, |10⟩, |11⟩, 

\[ \rho \otimes \rho = \frac{1}{4} \begin{pmatrix} (1+z)^2 & x(1+z) & x(1+z) & x^2 \\ x(1+z) & 1 - z^2 & x^2 & x(1-z) \\ x(1+z) & x^2 & 1 - z^2 & x(1-z) \\ x^2 & x(1-z) & x(1-z) & (1-z)^2 \end{pmatrix} . \tag{6} \]

A successful parity check extracts the 2×2 submatrix of the first and last rows and columns. 
Renormalizing this submatrix to have trace one, and converting back into (x, y, z) Bloch sphere 
coordinates, 

\[ (x, 0, z) \mapsto \left( \frac{x^2}{1 + z^2},\ 0,\ \frac{2z}{1 + z^2} \right) . \tag{7} \]

Now assume additionally that x = z; this can be achieved by applying a Hadamard gate to 
ρ with probability 1/2. Take two output qubits from two separate successful parity checks, and 
perform another parity check in the dual basis, i.e., switching x and z. Then after resymmetrizing 
about the x = z axis, we get overall 

\[ x \mapsto \frac{x^2 (3 + x^2)}{1 + 2x^2 + 2x^4} . \tag{8} \]

It is simple to verify that this function lies above x for 1/2 < x < 0.68. Repeatedly apply this 
scheme to iterate Eq. (8), in each step pairing together successful states from the previous iteration, 
and conditioning on success of all the measurements. The states output by this procedure will not 
converge to |H⟩. However, since 2 · 0.68 > 1.015, we can use Theorem 1 to 
obtain universality. This provides a new proof of Theorem 2. 

The procedure above is equivalent to taking four copies of ρ and conditioning the state to lie 
in the logical subspace of the four-qubit erasure code. In Ref. [Rei05], Theorem 2 is proved by 
using the seven-qubit Steane and 23-qubit Golay codes similarly. This scheme is simpler, but less 
efficient and can only distill |H⟩ indirectly. 

There are various related distillation schemes. For example, instead of dual-parity-checking 
two copies of the parity check's output, one can dual-parity-check ρ and the parity check's output, 
which is equivalent to distillation with a certain three-qubit code. We have checked by exhaustive 
enumeration all three-to-one- and four-to-one-qubit distillation protocols, and these schemes appear to be 
optimal, insofar as they maximize x' + z' starting from x = z = 1/2 + 0.001. 

3.2 Case y ≠ 0 

Next consider the case when initially y ≠ 0. Of course, symmetrizing about the Hadamard axis 
forces y to 0, but we would like to do better. With a single postselected parity check, 

\[ (x, y, z) \mapsto \frac{1}{1 + z^2} \left( x^2 - y^2,\ 2xy,\ 2z \right) . \tag{9} \]

Write the state in cylindrical coordinates (r, φ, z), with x = r cos φ and y = r sin φ. Then since 

\[ \frac{2xy}{x^2 - y^2} = \tan 2\varphi , \tag{10} \]

a postselected parity check accomplishes the mapping 

\[ (r, \varphi, z) \mapsto \left( \frac{r^2}{1 + z^2},\ 2\varphi,\ \frac{2z}{1 + z^2} \right) . \tag{11} \]

Thus, r and z transform just as x and z do in the case y = 0; but also the angle φ doubles. Now 
if φ = kπ/2^l for integers k, l, then repeated doubling will eventually move the state into the y = 0 
plane. Now, as long as 

\[ r + z = \sqrt{x^2 + y^2} + z > 1 , \tag{12} \]
one can verify with straightforward calculus that 

\[ r' + z' > 1 . \tag{13} \]

Therefore, once the state reaches the y = 0 plane, it will satisfy x + z > 1, so Theorem 2 applies. 
Unfortunately, this method is very inefficient, since the difference r' + z' − 1 can be quadratically 
smaller than r + z − 1. 

For most states, of course, φ ∉ {kπ/2^l : k, l ∈ Z}. However, since we are dealing with mixed 
states, there is a simple solution. We can create any state ρ₀ within the octahedron O, by preparing 
the Pauli eigenstates with the appropriate probabilities. The set of "nice" longitudinal angles 
{kπ/2^l : k, l ∈ Z} is dense. So we can certainly choose a ρ₀ (in fact, |+⟩ will suffice) and mix it 
with ρ with appropriate probabilities to generate ρ' with a nice longitudinal angle and such that the 
new state still has r' + z' > 1. This proves Theorem 3. 
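
The parity-check maps of this section can be checked numerically. The sketch below is a plain-Python illustration, assuming that a successful check keeps the block of ρ ⊗ ρ spanned by |00⟩ and |11⟩ and that the dual-basis check is the same operation conjugated by Hadamard gates; the function names are hypothetical. It follows the Bloch vector through one check, one dual check, and the symmetrized round built from them.

```python
import math

def rho(x, y, z):
    """Density matrix (nested lists) of the state with Bloch vector (x, y, z)."""
    return [[(1 + z) / 2, complex(x, -y) / 2],
            [complex(x, y) / 2, (1 - z) / 2]]

def parity_check(x, y, z):
    """Bloch vector after a successful parity check on two copies of rho(x, y, z)."""
    r = rho(x, y, z)
    a00 = r[0][0] * r[0][0]      # entry <00| . |00> of rho (x) rho
    a11 = r[1][1] * r[1][1]      # entry <11| . |11>
    a01 = r[0][1] * r[0][1]      # entry <00| . |11>
    norm = a00 + a11             # trace of the extracted 2x2 block
    return (2 * a01.real / norm, -2 * a01.imag / norm, (a00 - a11) / norm)

def dual_parity_check(x, y, z):
    """Parity check in the dual basis: Hadamard sends (x, y, z) to (z, -y, x)."""
    new_z, new_y, new_x = parity_check(z, -y, x)
    return (new_x, -new_y, new_z)

def symmetric_round(x):
    """Check, dual check, then resymmetrize, for a state with x = z and y = 0."""
    x1, _, z1 = parity_check(x, 0.0, x)
    x2, _, z2 = dual_parity_check(x1, 0.0, z1)
    return (x2 + z2) / 2
```

Iterating `symmetric_round` from a starting value slightly above 1/2 is one way to see the behavior described in the text: the coordinate grows past 1.015/2 but, in this sketch, levels off below 1/√2.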



4 Five-qubit distillation procedures 

In the proof of Theorem 1, Bravyi and Kitaev use a 15-qubit Reed-Muller code [KLZ96] that allows 
transversal application of exp(i(π/8)Y) to prove universality when max{|x| + |z|, |x| + |y|, |y| + |z|} > 
1.015. (This method is actually equivalent to a scheme given by Knill [Kni04, Rei05].) 

To prove U(ρ(x, y, z)) when |x| + |y| + |z| > 3/√7, they use stabilizer operations to project five 
copies of ρ into the codespace of the five-qubit code, with stabilizer generators 

\[ XZZXI,\ IXZZX,\ XIXZZ,\ ZXIXZ . \tag{14} \]

They then decode the logical qubit. Repeatedly applying this procedure, the state converges 
to |T⟩⟨T| = ρ(1/√3, 1/√3, 1/√3) as long as x = y = z > 1/√7. Here |T⟩ is the e^{2πi/3} eigenstate of 
the Clifford unitary T, a 120° rotation about (1, 1, 1) on the Bloch sphere. In general of course x, 
y and z will be unequal. In that case, first apply single-qubit unitaries to move ρ into the positive 
octant, so x, y, z > 0. Then with equal probabilities 1/3 apply either I, T or T². This symmetrizes 
ρ's coordinates. 
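
The symmetric five-qubit step can be explored without building 32×32 matrices, by working with Pauli expectation values. The sketch below is an illustration under stated assumptions: the logical operators are taken to be XXXXX, YYYYY and ZZZZZ (a decoding convention chosen here, not quoted from the paper), and all function names are hypothetical. It expands the codespace projector as the average of the 16 stabilizer group elements and evaluates each term on five copies of a product state.

```python
from itertools import product

_CYCLE = {"XY": (1, "Z"), "YZ": (1, "X"), "ZX": (1, "Y"),
          "YX": (3, "Z"), "ZY": (3, "X"), "XZ": (3, "Y")}

def pauli_mul(a, b):
    """Multiply phased Pauli strings; (k, s) stands for i**k times the string s."""
    k, out = a[0] + b[0], []
    for p, q in zip(a[1], b[1]):
        if p == "I" or q == "I":
            out.append(q if p == "I" else p)
        elif p == q:
            out.append("I")
        else:
            dk, r = _CYCLE[p + q]
            k += dk
            out.append(r)
    return (k % 4, "".join(out))

GENERATORS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]

def stabilizer_group(gens):
    """All products of subsets of the generators, with their signs."""
    group = []
    for bits in product((0, 1), repeat=len(gens)):
        elem = (0, "I" * len(gens[0]))
        for bit, g in zip(bits, gens):
            if bit:
                elem = pauli_mul(elem, (0, g))
        group.append(elem)
    return group

def expectation(elem, bloch):
    """Expectation of a signed Pauli string on a product of identical qubits."""
    k, s = elem
    assert k in (0, 2)               # Hermitian products carry phase +1 or -1
    val = 1.0 if k == 0 else -1.0
    for c in s:
        val *= bloch[c]
    return val

def five_qubit_step(x, y, z):
    """Logical Bloch vector after projecting five copies onto the codespace."""
    bloch = {"I": 1.0, "X": x, "Y": y, "Z": z}
    group = stabilizer_group(GENERATORS)
    norm = sum(expectation(g, bloch) for g in group)
    return tuple(
        sum(expectation(pauli_mul(g, (0, logical)), bloch) for g in group) / norm
        for logical in ("XXXXX", "YYYYY", "ZZZZZ"))
```

With this convention the output coordinate for symmetric input changes sign at each step, so the comparison of interest is between magnitudes: the magnitude is unchanged at x = y = z = 1/√7, grows above that value, and shrinks below it.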

A simple modification of the above procedure can give improvements for asymmetric ρ. For 
x = y < z, applying T to two of the five copies of ρ before distilling with the five-qubit code 
(without symmetrizing ρ's coordinates) gives an improvement for 0 < x = y < 1/√3, according to 
numerical iterations of the equation 



(x,y,z) ^ (x',y',z r ) 



+4xyz(x + y + z) 



I x — x (y + z ) + yz(y + z + 2x — xyz), 
— y 3 + y 2 (x 3 + z 3 ) — xz(x + z + 2y — xyz), 
\ z 3 - z 2 (x 3 + y 3 ) + xy{x + y + 2z - xyz), 



(15) 
Table 1: Seven representative examples of states for which Theorem 5 holds. Each ρ_i is ¼ II 
plus the fraction f times the listed coordinates (tensor products implied). For example, ρ_1 = 
¼ II + (1/12)(IY + IZ − XX + YX + ZX). 

f IX IY IZ XI XX XY XZ YI YX YY YZ ZI ZX ZY ZZ 


1/12 
1 





-1 











1 











1 








1/76 


1 


2 


-2 


2 


-6 


-1 


-1 
-2 


-1 
1 
2 


2 


1/72 


2 


1 





-2 
-3 


3 





2 


-2 


6 


-1 


-2 





1/76 
5 


-1 


-2 


-1 


2 


2 
-2 


-5 


-1 
-1 


1/52 
-3 


-1 


-3 


-3 


3 


-1 


3 


-3 


3 


-1 


-3 


3 


1/60 
2 


2 


1 


-3 





1 


6 


-2 


-2 


1 


1 





-3 


1 


1/52 


1 


-1 


1 


-3 


3 


-1 


3 


1 


1 


-1 


-3 


-1 


3 


1 


-1 



The thick red curve in Figure 2(c) shows the distillable region, computed numerically, using this 
procedure in the x = y cross-section of the Bloch sphere. There is an improvement between 
x = y = 0.1956 and x = y = z = 1/√7 ≈ 0.378. In particular, Theorem 4 is one special case. 

We have exhaustively searched all five-to-one-qubit stabilizer reductions, evaluating each on 
ρ((1/√7)(1, 1, 1)) and on the state ρ(x, x, z) with x ≈ 0.229, z ≈ 0.677 of Theorem 4, and found 
no code performed better than the five-qubit code, either alone or with T applied to one or two 
of the qubits. 

5 Distillation of multi-qubit states 

The question U(|ψ⟩)? for multi-qubit pure states reduces to the same question for single-qubit 
pure states [Rei05]. In fact U(|ψ⟩) holds for all pure states |ψ⟩ that are not stabilizer states, since all 
nonstabilizer single-qubit pure states |ψ⟩⟨ψ| have |x| + |y| + |z| > 1. Therefore all nonstabilizer pure 
states |ψ⟩ give universality, U(|ψ⟩). 

A natural question is whether the same type of reduction holds even for mixed states. For 
brevity, call a state a nonstabilizer state if it is not a mixture of stabilizer states. Given a nonsta- 
bilizer multi-qubit mixed state, can it be reduced to a single-qubit nonstabilizer mixed state, using 
postselected stabilizer operations? 

It turns out that the answer is no. The multi-qubit question for mixed states does not reduce 
to the single-qubit question (Theorem 5). We present examples of nonstabilizer mixed states for 
which every two-to-one-qubit stabilizer reduction gives a mixture of stabilizer states. It is not 
known whether it is possible to achieve universality using these states in some more direct way. 

Seven representative examples are given in Table 1. In this section, we will prove that Theorem 5 
holds for each of these states. They were found by a geometrical argument considering the solid 
polyhedron O_n of convex combinations of all n-qubit stabilizer states (for n = 2), a generalization 
of the solid octahedron O_1 consisting of mixtures of single-qubit stabilizer states. 

5.1 Stabilizer reductions from n qubits to one qubit 

To begin, we need to characterize the possible ways of using stabilizer operations to reduce an 
n-qubit state down to a single-qubit state. We will show that, without loss of generality, one may 
use only Clifford group unitaries and postselected Pauli measurements. It cannot be necessary to 
add ancilla qubits, or to adapt processing according to measurement results. 

Lemma 1. For any postselected stabilizer procedure taking ρ, an n-qubit state, to a one-qubit 
nonstabilizer state, there exists another procedure taking ρ to a one-qubit nonstabilizer state that 
consists of an n-qubit Clifford unitary, followed by projecting the last n − 1 qubits onto |0^{n−1}⟩. 

Proof. We first remark that without loss of generality all measurements in the reduction procedure 
are postselected. Indeed, if some measurement M_i is not postselected, then the final outcome will 
be a mixture of the outcome conditioned on an M_i = +1 measurement result and the outcome 
conditioned on M_i = −1. If the final outcome is not a mixture of stabilizer states, then at least 
one of these conditioned states must not have been. 

To complete the proof, we must only enforce the assumption that the reduction procedure works 
on the n qubits without requiring any extra ancillas. For any procedure using ancillas, there is 
another procedure that uses no ancillas, and that has identical output on all n-qubit stabilizer 
states, except possibly with a higher success probability. 

This fact is a consequence of the Gottesman-Knill stabilizer algebra formalism [Got98] . In order 
to track the evolution of an arbitrary system under stabilizer operations, it suffices to keep track 
of the stabilizer group and logical X and Z values for each of the n unfixed qubits. 

Assume that the procedure has the following form: 

1. In Phase 1, measure certain Pauli operators supported on the n original (logical) qubits. 
(Measuring a Pauli operator P means applying the projection ½(I + P).) 

2. Then prepare ancillas |0^m⟩, and in Phase 2 measure Pauli operators acting possibly on all 
n + m qubits. 

3. Finally, apply a Clifford unitary and trace out all but the first qubit. 

This form may be assumed since applying a unitary U then measuring P has the same effect as 
measuring U†PU, then applying U. 

Next we show how to either eliminate or move to Phase 1 any measurements in Phase 2. 
Assume that Phase 1 has completed, leaving possibly an arbitrary state on the first n qubits; set 
the stabilizer group S to be all strings of Z or I supported on the ancilla qubits. Consider the first 
measurement in Phase 2, of a Pauli operator P. There are a few cases: 

1. If P or −P is in S, then the measurement either has no effect or succeeds with probability 
zero. Eliminate this measurement. 

2. If the operator P being measured commutes with all the stabilizer group elements but is not in 
the stabilizer group (P ∈ N(S) \ S), then there exists Q ∈ S such that PQ is supported on the 
first n qubits. Measuring PQ has the same effect as measuring P. Remove the measurement 
of P from Phase 2 and add a measurement of PQ to Phase 1. 

3. If P anticommutes with a stabilizer group element, then for some ancillary qubit i, P_i ∈ 
{X, Y}. Assume P_i = X; the other case is similar. Then the Pauli PX_i has the identity 
on the ith qubit. Consider the unitary U = Λ_i(PX_i) that applies PX_i controlled by the ith 
qubit. U is a Clifford unitary such that UPU† = X_i. Therefore applying U, measuring X_i and 
applying U† to the result, is equivalent to measuring P. However, applying U has no effect, 
since the control qubit is |0⟩. Therefore, we can simply measure X_i, and delay application of 
U† to the end of the protocol, by commuting it past any remaining measurements in Phase 2. 
This measurement can be eliminated, since preparing |0⟩ and measuring X is the same as 
simply preparing H|0⟩ = |+⟩, except with lower success probability. 

Thus we may assume that there are no measurements in Phase 2. 
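
The conjugation identity used in case 3 can be checked on a small instance. The sketch below is illustrative only: it takes one ancilla (the first tensor factor) and two data qubits, picks the arbitrary example P = X ⊗ Z ⊗ Y, and uses hypothetical helper names with plain nested lists as matrices.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PROJ0 = [[1, 0], [0, 0]]   # |0><0| on the ancilla
PROJ1 = [[0, 0], [0, 1]]   # |1><1| on the ancilla

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[p * q for p in row_a for q in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

R = kron(Z, Y)                       # P X_i restricted to the data qubits
P = kron(X, R)                       # measured Pauli, with P_i = X on the ancilla
# U applies P X_i on the data qubits when the ancilla is |1>.
U = add(kron(PROJ0, kron(I2, I2)), kron(PROJ1, R))

conjugated = matmul(matmul(U, P), dagger(U))
X_ANCILLA = kron(X, kron(I2, I2))

# U leaves any state with the ancilla in |0> untouched.
data = [[0.5], [0.5j], [-0.5], [0.5]]
with_ancilla = kron([[1], [0]], data)
```

In this instance `conjugated` coincides with X acting on the ancilla alone, and U fixes `with_ancilla`, matching the two facts the argument relies on.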

At the end of Phase 1, there remains at most one degree of freedom in the first n qubits, and 
this degree of freedom can be isolated with Clifford unitaries involving only those qubits. No extra 
ancillas are necessary. 

Moreover, the measurements in Phase 1 can be assumed to all commute with each other. If 
not, and P and Q are two successive anticommuting measurements, then as above P can be moved 
into just Z_i for some i and Q can be assumed to be X_i. Then measuring Q was unnecessary, for 
one could have just applied H_i with the same effect, and higher success probability. □ 

The fact that ancilla preparation is unnecessary has two useful consequences. First, this leaves 
only a finite number of different stabilizer operations that can be applied to reduce an n-qubit state 
down to a single-qubit state. In our proof we will consider up to symmetries an exhaustive list of all 
possible stabilizer operations reducing two qubits to one. Second, it implies that a counterexample 
for n = 2 also gives a counterexample for n > 2; take the same state but adjoin |0^{n−2}⟩. If the 
original state is not a mixture of stabilizer states, then nor is the n-qubit state, since any such mixture 
would need to be trivial on the last n − 2 qubits. If no algorithm using stabilizer operations reduces 
the two-qubit state to a nonstabilizer single-qubit state, then the same will be true for the n-qubit 
state because it can in particular be prepared from the two-qubit state with ancilla preparation. 

5.2 Polyhedron O_n of mixtures of n-qubit stabilizer states 

Any n-qubit density matrix ρ can be written as a real combination of the n-qubit Pauli operators, 
with the coefficient of a Pauli P given by c_P(ρ) = (1/2^n) Tr(Pρ). The coefficient of I is fixed to 1/2^n 
since Tr ρ = 1, but the other 4^n − 1 coordinates can vary. The state stabilized by the stabilizer 
group S has density matrix (1/2^n) Σ_{s ∈ S} s, i.e., it has coordinates 1/2^n for s ∈ S and 0 elsewhere. 

Lemma 2. The number of different n-qubit stabilizer states is 

\[ N = 2^n \prod_{k=1}^{n} (2^k + 1) . \tag{16} \]

The number of different n-to-1-qubit stabilizer reductions, up to normalization and application of 
Cliffords to the output, is (1/6)(2^n − 1)N. 

Proof. The expression for N is a simplification of 

\[ 2^n \, \frac{(4^n - 1)\bigl(\tfrac{4^n}{2} - 2\bigr)\bigl(\tfrac{4^n}{4} - 4\bigr) \cdots \bigl(\tfrac{4^n}{2^{n-1}} - 2^{n-1}\bigr)}{(2^n - 1)(2^n - 2) \cdots (2^n - 2^{n-1})} . \tag{17} \]

Here, the initial factor of 2^n is for the number of different syndromes given an unsigned set of 
stabilizers. The numerator is the number of ways of picking (in order) n nontrivial, independent, 
commuting generators. Given a stabilizer group, the denominator is the number of ways of picking 
in order generators. 

According to Lemma 1, the n-to-1-qubit stabilizer reductions correspond to choosing a stabilizer 
group of size 2^{n−1}, up to normalization and application of Cliffords to the final, logical degree 
of freedom, i.e., picking n − 1 independent commuting generators. Therefore the count of such 
reductions is 

\[ 2^{n-1} \, \frac{(4^n - 1)\bigl(\tfrac{4^n}{2} - 2\bigr)\bigl(\tfrac{4^n}{4} - 4\bigr) \cdots \bigl(\tfrac{4^n}{2^{n-2}} - 2^{n-2}\bigr)}{(2^{n-1} - 1)(2^{n-1} - 2) \cdots (2^{n-1} - 2^{n-2})} , \tag{18} \]

which simplifies to (1/6)(2^n − 1)N. The factor 1/6 arises because choosing one nontrivial element of the 
stabilizer group to be, say, logical +X overcounts by six. □ 

Figure 3: Stabilizer groups for 15 two-qubit stabilizer states. The 60 total two-qubit stabilizer 
states can be enumerated by listing the four possible sign choices for each of these groups. For 
example, by switching the sign of XX for the last case, we get also the state ¼(II − XX + YY + ZZ). 
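
The counting argument of Lemma 2 is easy to evaluate with exact arithmetic. The sketch below is an illustration with hypothetical function names; it assumes the product form N = 2^n ∏_{k=1}^{n} (2^k + 1) for the number of stabilizer states and the generator-counting ratio described in the proof, and compares the two.

```python
from fractions import Fraction

def num_stabilizer_states(n):
    """Closed form 2^n * prod_{k=1..n} (2^k + 1)."""
    total = 2 ** n
    for k in range(1, n + 1):
        total *= 2 ** k + 1
    return total

def count_via_generators(n, rank):
    """Sign choices times ordered commuting generator lists, divided by
    the number of ordered generating lists of a single group of that rank."""
    value = Fraction(2 ** rank)
    for k in range(rank):
        # k-th generator: commutes with the previous k, lies outside their span.
        value *= Fraction(4 ** n, 2 ** k) - 2 ** k
        value /= 2 ** rank - 2 ** k
    return value

def num_reductions(n):
    """n-to-1-qubit reductions: stabilizer groups with n - 1 generators."""
    return count_via_generators(n, n - 1)
```

For n = 2 this gives 60 stabilizer states and 30 two-to-one-qubit reductions, the latter corresponding to projecting onto either eigenspace of each of the 15 nontrivial two-qubit Paulis.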

We are interested in convex combinations of stabilizer states. For n = 1, mixtures of the six 
stabilizer states form the closed, solid octahedron O_1 in three dimensions, shown fitting within the 
Bloch sphere in Figure 1. For n = 2, O_2 has 60 vertices in a 15-dimensional space (Figure 3). 

Notice that n-to-1-qubit stabilizer reductions are linear from the original (4^n − 1)-dimensional 
space into the four-dimensional space with basis I, X, Y, Z; the coordinate for the identity I is 
included because the trace of the output is not necessarily one without nonlinear renormalization. 
Therefore, the set of n-qubit states mapped into O_1 by a given stabilizer reduction is convex. 



5.3 Counterexamples for n = 2 

Specialize now to n = 2. Let us show that none of the states in Table 1 are stabilizer states, and 
that for each of them every two-to-one-qubit stabilizer reduction gives a stabilizer state. 

Every face of $O_2$ has at least 15 vertices. Each of the counterexamples in Table 1 comes from
finding a face $F$ of $O_2$ such that for every stabilizer reduction to a single qubit, $F$ is not mapped
to a face of the cone of $O_1$. That such faces of $O_2$ can be found intuitively seems quite reasonable.
Indeed, the alternative would be that, for every face, all 15 or more vertices are mapped to only
four vertices in the output, three vertices of $O_1$ plus possibly 0. For, if a fourth vertex of $O_1$ is in
the image, then necessarily two vertices must oppose one another and cancel out, so the image of
$F$ is not a face. Intuitively, this alternative possibility seems unlikely. However, it can in fact occur
for some faces of $O_2$; see Eq. (22) below.

Table 2: Inequalities defining seven faces of the two-qubit stabilizer polyhedron $O_2$. Each inequality
is satisfied by all of the 60 two-qubit stabilizer states, but the $j$th inequality is violated by the $j$th
state of Table 1. To explain the notation, for example the first inequality is
$\frac14 \mathrm{Tr}\,\rho(-II + IY + IZ - XX + YX + ZX) \leq 0$. One can check that substituting the first state from Table 1 gives 1/6.

 j   II  IX  IY  IZ  XI  XX  XY  XZ  YI  YX  YY  YZ  ZI  ZX  ZY  ZZ
 1   -1   0   1   1   0  -1   0   0   0   1   0   0   0   1   0   0
 2   -2   1   1  -1   1  -2   0   0   2  -1  -1   1   0   1   1   1
 3   -2   1   1   0  -1   0   2  -1   1   0   0  -1   2  -1  -1   0
 4   -2   1   2   0  -1   0   1   1   2  -1  -2   0   0   1   0   0
 5   -2   1   0   0  -1   0  -1  -1   1   0   1  -1   1   0  -1   1
 6   -3   2   2   1   1  -2   0   1   3  -2  -2  -1   1   0  -2   1
 7   -4   2  -1   2  -3   3  -2   3   1   1  -2  -3  -1   3   2  -1

5.3.1 Counterexamples lie outside O2 

In order to prove that a certain state $\rho$ is nonstabilizer, we need to check that $\rho$ is indeed a valid
density matrix, and also need to exhibit a separating hyperplane, i.e., an inequality satisfied by
every stabilizer state but violated by $\rho$. In Table 2, we list seven inequalities satisfied by each of
the 60 two-qubit stabilizer states, but violated by the respective states of Table 1. To verify this,
compute inner products of the inequality with the state coordinates.
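
As an illustration of this inner-product check, here is a small sketch for the first inequality of Table 2. It assumes the first state of Table 1 is $\rho = \frac14 II + \frac1{12}(IY + IZ - XX + YX + ZX)$, the normalization consistent with the value 1/6 quoted in the caption of Table 2; the helper name `evaluate` is hypothetical.

```python
from fractions import Fraction

PAULIS = [a + b for a in "IXYZ" for b in "IXYZ"]  # II, IX, ..., ZZ

# Coordinates c_P of rho = sum_P c_P P (assumed form of the first Table 1 state).
rho = {p: Fraction(0) for p in PAULIS}
rho["II"] = Fraction(1, 4)
for pauli, sign in [("IY", 1), ("IZ", 1), ("XX", -1), ("YX", 1), ("ZX", 1)]:
    rho[pauli] = Fraction(sign, 12)

# Coefficients a_P of the first inequality of Table 2.
ineq = {p: 0 for p in PAULIS}
ineq.update({"II": -1, "IY": 1, "IZ": 1, "XX": -1, "YX": 1, "ZX": 1})


def evaluate(ineq, coords):
    """(1/4) Tr[rho * sum_P a_P P] = sum_P a_P c_P, because Tr(PQ) = 4 if P == Q else 0."""
    return sum(ineq[p] * coords[p] for p in PAULIS)


# The stabilizer state |00><00| = (1/4)(II + IZ + ZI + ZZ) for comparison.
zero_zero = {p: Fraction(0) for p in PAULIS}
for pauli in ("II", "IZ", "ZI", "ZZ"):
    zero_zero[pauli] = Fraction(1, 4)

print(evaluate(ineq, rho))        # positive: the inequality is violated
print(evaluate(ineq, zero_zero))  # non-positive: the inequality is satisfied
```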

5.3.2 Counterexamples reduce to mixtures of single-qubit stabilizer states 

Next, we claim that for each counterexample from Table 1, indeed every two-to-one-qubit stabilizer 
reduction outputs a mixture of stabilizer states. This can be checked by enumeration, since by 
Lemma 1 there are only a finite number of stabilizer reductions that need to be considered. In fact, 
there are exactly 30 stabilizer reductions to check. Each reduction corresponds to measuring one of 
the 15 nontrivial two-qubit Pauli operators, postselecting on outcome ±1. This leaves one degree 
of freedom uniquely defined up to single-qubit stabilizer operations. 

Of course, these checks can easily be done on a computer, but it is worth understanding the 
algebra involved. We will give a simple example that should elucidate the general situation. 

Consider, e.g., the state $\rho = \frac14 II + \frac1{12}(IY + IZ - XX + YX + ZX)$. On measuring $ZZ$,
postselecting on a $+1$ outcome, the unnormalized state becomes

$$\tfrac12(II + ZZ)\,\rho\,\tfrac12(II + ZZ) = \tfrac18\big(II + ZZ + \tfrac13(IZ + ZI - XX + YY + XY + YX)\big)$$

$$= \tfrac18(II + ZZ)\big(II + \tfrac13(IZ - XX + XY)\big) . \qquad (19)$$

Applying a CNOT from the first qubit into the second in order to remove the fixed $ZZ$ stabilizer,
the state becomes $\frac14\big(I - \frac13(X + Y + Z)\big) \otimes \frac12(I + Z)$. And indeed, the sum of the absolute values of
the $X, Y, Z$ components of the first qubit equals the $I$ component, so the first qubit is a mixture of
stabilizer states.

We could have shortened the above calculation. Notice that $IZ$, $ZI$, $XX$, $-YY$, $XY$, $YX$
are the nontrivial two-qubit Paulis commuting with $ZZ$, besides $ZZ$ itself. Then compare $\rho$'s
coordinates $c_{II} + c_{ZZ}$ to

$$|c_{IZ} + c_{ZI}| + |c_{XX} - c_{YY}| + |c_{XY} + c_{YX}| . \qquad (20)$$

Both equal 1/4.
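
The enumeration of all 30 reductions can be sketched as follows. This is an illustrative check rather than the author's program: it assumes the state $\rho = \frac14 II + \frac1{12}(IY + IZ - XX + YX + ZX)$ and applies the comparison of Eq. (20) to each measured Pauli $P$ and outcome $s = \pm1$, pairing each commuting $Q$ with the Pauli $R$ defined by $PQ = \lambda R$, $\lambda = \pm1$. The names `multiply`, `commute` and `reduction_margin` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Single-qubit products: MUL[a][b] = (k, c) means a*b = i^k * c.
MUL = {
    "I": {"I": (0, "I"), "X": (0, "X"), "Y": (0, "Y"), "Z": (0, "Z")},
    "X": {"I": (0, "X"), "X": (0, "I"), "Y": (1, "Z"), "Z": (3, "Y")},
    "Y": {"I": (0, "Y"), "X": (3, "Z"), "Y": (0, "I"), "Z": (1, "X")},
    "Z": {"I": (0, "Z"), "X": (1, "Y"), "Y": (3, "X"), "Z": (0, "I")},
}
PAULIS = ["".join(p) for p in product("IXYZ", repeat=2)]  # PAULIS[0] == "II"


def multiply(p, q):
    """Return (k, r) with p*q = i^k * r for two-qubit Pauli strings."""
    k, out = 0, ""
    for a, b in zip(p, q):
        ka, c = MUL[a][b]
        k = (k + ka) % 4
        out += c
    return k, out


def commute(p, q):
    return multiply(p, q)[0] == multiply(q, p)[0]


def reduction_margin(coords, P, s):
    """(c_II + s c_P) minus the sum over the three cosets {Q, R} of |c_Q + s*lam*c_R|.
    A nonnegative margin means the reduced qubit is a mixture of stabilizer states."""
    lhs = coords["II"] + s * coords[P]
    rhs = Fraction(0)
    seen = {"II", P}
    for Q in PAULIS:
        if Q in seen or not commute(P, Q):
            continue
        k, R = multiply(P, Q)          # k is 0 or 2 since P and Q commute
        lam = 1 if k == 0 else -1
        seen.update({Q, R})
        rhs += abs(coords[Q] + s * lam * coords[R])
    return lhs - rhs


rho = {p: Fraction(0) for p in PAULIS}
rho["II"] = Fraction(1, 4)
for pauli, sign in [("IY", 1), ("IZ", 1), ("XX", -1), ("YX", 1), ("ZX", 1)]:
    rho[pauli] = Fraction(sign, 12)

margins = {(P, s): reduction_margin(rho, P, s) for P in PAULIS[1:] for s in (1, -1)}
print(len(margins), min(margins.values()))
```

A margin of zero means the reduced state sits exactly on the boundary of the octahedron, as in the $ZZ$ example above.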

The general procedure follows immediately; we need to compare the sums of the absolute
values of the appropriate averages of the coordinates of $\rho$, where the averages are over the Paulis
commuting with the measured operator $P$ and differing by $P$. For example, projecting $\rho$ by
$\frac12(II - YZ)$, we need to compare $c_{II} - c_{YZ}$ to

$$|c_{XX} - c_{ZY}| + |c_{XY} + c_{ZX}| + |c_{IZ} - c_{YI}| . \qquad (21)$$

One notices from these calculations an interesting property of $\rho$. Let $S$ be the set of nontrivial
Paulis $P$ such that $c_P(\rho) \neq 0$. Then,

1. No two elements of $S$ commute.

2. The difference between any two elements of $S$ commutes with neither.

3. Exactly three elements of $S$ commute with any nontrivial Pauli outside $S$.

The first property implies that after measuring any element of $S$ and postselecting on either $+1$
or $-1$, the remaining degree of freedom is a fully mixed state, since the relevant sum of averaged
coordinates is zero. The second property, which is a consequence of the first property, implies
that after measuring a Pauli $P$ not in $S$, it is impossible to have both $c_Q$ and $c_{PQ}$ nonzero for $Q$
commuting with $P$. Therefore, the relevant comparison gives $\frac14 \geq \frac1{12} + \frac1{12} + \frac1{12}$, so the reduced state
lies inside $O_1$. In fact, the third property implies that it lies on the boundary of $O_1$.

Thus, these properties of $\rho$ suffice to prove that any two-to-one-qubit stabilizer reduction on
$\rho$ outputs a mixture of stabilizer states. We do not know of similarly concise proofs for the other
states in Table 1, but the calculation of course still reduces to summing averages of coordinates as
in Eq. (20).

5.3.3 Structure of O2 by computer analysis

The counterexamples from Table 1 were found by using the cdd software for polyhedral computations
[Fuk]. On input the 60 two-qubit stabilizer states, cdd outputs the 22,320 external faces of
their polyhedral convex hull. Most of these faces are symmetrical under two-qubit Clifford operations;
to determine a minimum set of representatives, we repeatedly chose a random two-qubit
Clifford and reduced the faces modulo that symmetry. After a small number of iterations, only
eight faces remained. Seven of these are those displayed in Table 2, and the eighth is given by

$$\tfrac14 \mathrm{Tr}\,\rho(-II + IX + IY + IZ + XI - XX - XY - XZ) \leq 0 . \qquad (22)$$

None of these remaining faces are symmetrical to each other, because two-qubit Cliffords can only
permute their coordinates by conjugation, possibly also changing the sign $\pm1$ of a coordinate.
Therefore, no two inequalities with differing $II$ coordinates can be symmetrical to each other. The
inequality of Eq. (22) and the first inequality of Table 2 cannot be symmetrical to each other
since they have a different number of nonzero coordinates. No two of the inequalities in Table 2
with $II$ coordinate of $-2$ can be symmetrical since they either have different numbers of nonzero
coordinates or have coordinates with different magnitudes.

Next we chose an element in the center of each face and pushed it out from $\frac14 II$ as far as
possible such that every two-to-one stabilizer reduction still output only stabilizer mixtures.
This procedure worked for all the hyperplanes except that of Eq. (22). For the state in the center
of this face, there does exist a two-to-one reduction leading to a state lying on the boundary of $O_1$
with equal $x, y, z$ coordinates. Since we do not know the limits of distillation in this direction, it is
unknown if applying this reduction to a state moved slightly outward leads to universality.

5.4 Direct distillability of multi-qubit states 

We have not shown that the states in Table 1 cannot be distilled to give universality, only that 
such a distillation procedure could not start by reducing from two qubits to one. What more can 
be said about these examples, particularly the first one, having the most apparent structure? 

For more than two qubits, the calculation of whether or not the output lies inside $O_1$ still
simplifies to an equation like Eq. (20), except within each of the three terms we average over a set
of $2^{n-1}$ coefficients. We have checked by exhaustive enumeration of all size-eight abelian subgroups
of the Paulis on four qubits that for none of the states in Table 1 do two copies allow a reduction
to a nonstabilizer state. We have also verified this for several pairs of different states from Table 1.

Unfortunately, this does not come close to proving that $\mathcal{U}(\rho)$ is false for any of these states.
Currently there are almost no nontrivial upper bounds on the power of magic states
distillation, even in the single-qubit case. The only exception, to the author's knowledge, is recent
work by Campbell and Browne [CB09]. They show that for any fixed $x, y, z > 0$, and any fixed
stabilizer code, there exists an $\epsilon > 0$ such that the distillation procedure based on this code fails
for states $\rho(fx, fy, fz)$ when $|f| < \frac{1 + \epsilon}{x + y + z}$. The possibility remains, though, that $\epsilon$ approaches 0 as
the code size increases.

An interesting special case of the multi-qubit magic states distillation problem is the distillability
of unentangled states. Say that we are allowed to prepare copies of each of a set of states $\rho_1, \ldots, \rho_t$.
When do we have $\mathcal{U}(\bigotimes_i \rho_i)$? As a simple example, say we can prepare both $\rho(x, y, z)$ and $\rho(x, -y, z)$.
Reflection across the $xz$ plane is not a stabilizer operation, so these two states are generally not
equivalent under stabilizer operations. Now letting $\Pi_{\mathrm{even}} = |00\rangle\langle00| + |11\rangle\langle11|$,

$$\Pi_{\mathrm{even}}\big(\rho(x, y, z) \otimes \rho(x, -y, z)\big)\Pi_{\mathrm{even}} = \frac14 \begin{pmatrix} (1 + z)^2 & r^2 \\ r^2 & (1 - z)^2 \end{pmatrix} , \qquad (23)$$

where $r^2 = x^2 + y^2$ and the matrix is written in the basis $|00\rangle, |11\rangle$. Thus immediately the $y$ coordinate is zeroed out, and we essentially have
Eq. (7) with $r$ in place of $x$. Thus directly $\mathcal{U}(\rho(x, y, z) \otimes \rho(x, -y, z))$ provided
$\max\{\sqrt{x^2 + y^2} + z,\ x + \sqrt{y^2 + z^2}\} > 1$.
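
The even-parity projection of Eq. (23) can be reproduced with a few lines of arithmetic. The sketch below assumes the convention $\rho(x, y, z) = \frac12(I + xX + yY + zZ)$ and decodes $|00\rangle, |11\rangle$ as the logical basis; the function names are illustrative only.

```python
from math import sqrt


def rho(x, y, z):
    """Single-qubit state (1/2)(I + xX + yY + zZ) as a 2x2 nested list."""
    return [[(1 + z) / 2, complex(x, -y) / 2],
            [complex(x, y) / 2, (1 - z) / 2]]


def even_parity_block(a, b):
    """Restriction of a (x) b to span{|00>, |11>}: <ii| a (x) b |jj> = a[i][j] * b[i][j]."""
    return [[a[i][j] * b[i][j] for j in range(2)] for i in range(2)]


def decoded_coordinates(block):
    """Renormalize the 2x2 block and read off the Bloch coordinates of the logical qubit."""
    tr = (block[0][0] + block[1][1]).real
    x_out = 2 * block[0][1].real / tr
    y_out = -2 * block[0][1].imag / tr
    z_out = (block[0][0] - block[1][1]).real / tr
    return x_out, y_out, z_out


x, y, z = 0.5, 0.5, 0.4
block = even_parity_block(rho(x, y, z), rho(x, -y, z))
x_out, y_out, z_out = decoded_coordinates(block)
print(block[0][1], (x * x + y * y) / 4)      # off-diagonal entry equals r^2 / 4
print(x_out, y_out, z_out)                   # y_out vanishes
print(x_out + z_out > 1, sqrt(x * x + y * y) + z > 1)
```

The last line compares the condition $|x_{\mathrm{out}}| + |z_{\mathrm{out}}| > 1$ on the decoded state with the condition $\sqrt{x^2 + y^2} + z > 1$ from the text; for $0 \le z < 1$ the two are equivalent.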

6 Distillation of unknown states 

Note that the success of the method for proving Theorem 3 depends strongly on us knowing the
state $\rho$ exactly, and that precisely the same state can be prepared repeatedly. Any small errors
will be amplified quickly in the angle-doubling step. These assumptions are very strong, and
are probably not physically justified. For practical applications, the definition of $\mathcal{U}(\rho)$ should be
revisited to incorporate other conditions, including stability. The exact conditions may depend
on the application. For example, in the threshold proof for concatenated distance-three codes in
Ref. [Rei06b], dealing with the possibility of asymmetrically erroneous states requires:

Theorem 7. For any constant $\delta > 0$, perfect stabilizer operations with adaptive classical control,
together with the ability to prepare states $\rho_i = \rho(f_i/\sqrt3, f_i/\sqrt3, f_i/\sqrt3) = \frac12\big(I + \frac{f_i}{\sqrt3}(X + Y + Z)\big)$
with each $f_i$ unknown but at least $(1 + \delta)\sqrt{3/7}$, gives quantum universality.

Thus it is enough to have lower bounds on the fidelities of the prepared states with $|T\rangle$, and
the states need not be identical. Moreover, in this case, the bound on the allowed error rate turns
out to be the same for these nonidentical states as for the identical states assumed in Theorem 1.

Note that some assumptions that are impractical at the physical level become relevant at the 
encoded level, when we try to apply magic states distillation on top of a stabilizer operation fault- 
tolerance scheme. An example is the assumption of perfect stabilizer operations. See Section 7.1. 
Additionally, it might not be surprising that the distillation model becomes more delicate as the 
limits of distillation are approached. Even delicate, artificial models can be of interest when we use 
magic states distillation to consider noise threshold upper bounds, in Section 7.2. Still, it is possible 
that the current techniques are unnecessarily fragile, and that there exist more direct, practical and 
efficient methods. 

Proof of Theorem 7. Theorem 7 is an extension of Theorem 1, from [BK05]. It is based on the
same five-qubit code distillation scheme discussed in Section 4. Denote by $[f]_s$ the symmetric sum
of $s$-tuples of the variables $f_1, \ldots, f_5$, i.e.,

$$[f]_s = \sum_{S \subseteq \{1,2,3,4,5\},\ |S| = s}\ \prod_{i \in S} f_i . \qquad (24)$$

Take five prepared states, $\bigotimes_{i=1}^5 \rho_i$, and use stabilizer operations to project into the codespace of
the five-qubit code, then decode the logical qubit. A simple calculation gives that the probability
of success is

$$p_{\mathrm{success}} = \frac1{48}\big(3 + [f]_4\big) . \qquad (25)$$

The $x, y, z$ coordinates of the output state, conditioned on success, are the same, equal to

$$-\frac1{\sqrt3}\,\frac1{p_{\mathrm{success}}}\,\frac1{48}\big([f]_3 - 2[f]_5\big) . \qquad (26)$$

These coordinates are negative, but can be rotated back to the positive octant with a stabilizer
operation. Then the output state is $\rho(f_{\mathrm{out}}/\sqrt3, f_{\mathrm{out}}/\sqrt3, f_{\mathrm{out}}/\sqrt3)$, where

$$f_{\mathrm{out}} = \frac{[f]_3 - 2[f]_5}{3 + [f]_4} . \qquad (27)$$

Now note that $p_{\mathrm{success}}$ is monotonically increasing in each $f_i$, so distillation remains efficient when
the $f_i$ are unequal.

Also, simple algebra gives that $\partial f_{\mathrm{out}}/\partial f_i \geq 0$, so improving any of the input fidelities can only
improve the output fidelity. Indeed, differentiate $f_{\mathrm{out}}$ with respect to $f_5$; the other derivatives are
related by symmetry. Use the quotient rule $d(a/b) = \frac1{b^2}(b\,da - a\,db)$. The numerator, which does
not involve $f_5$, is, after simplifications,

$$f_1 f_2\big(3 - f_1 f_2 f_3 f_4 - f_3 f_4 - \tfrac13 f_1 f_2 f_3^2 f_4^2 - \tfrac13 f_1 f_2 (f_3^2 + f_4^2)\big) + \text{symmetrical terms} . \qquad (28)$$

Each term is nonnegative when the $f_i \in [0, 1]$, implying that $f_{\mathrm{out}}$ is monotone in each $f_i$. □
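
The formulas in this proof are easy to check numerically. The following sketch assumes Eqs. (25), (27) and (28) in the forms $p_{\mathrm{success}} = (3 + [f]_4)/48$, $f_{\mathrm{out}} = ([f]_3 - 2[f]_5)/(3 + [f]_4)$, and the pair sum written above; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod, sqrt


def sym(f, s):
    """Symmetric sum [f]_s over all s-element subsets of the inputs."""
    return sum(prod(c) for c in combinations(f, s))


def p_success(f):
    return (3 + sym(f, 4)) / 48


def f_out(f):
    return (sym(f, 3) - 2 * sym(f, 5)) / (3 + sym(f, 4))


def derivative_numerator(f4):
    """Quotient-rule numerator of d f_out / d f_5; it depends only on f_1..f_4."""
    e2, e3, e4 = sym(f4, 2), sym(f4, 3), sym(f4, 4)
    return (3 + e4) * (e2 - 2 * e4) - e3 ** 2


def eq28(f4):
    """Pair-by-pair form: one term per pair {i, j} with complementary pair {k, l}."""
    e4 = prod(f4)
    total = 0
    for i, j in combinations(range(4), 2):
        k, l = (m for m in range(4) if m not in (i, j))
        a, b = f4[i] * f4[j], f4[k] * f4[l]
        total += a * (3 - e4 - b - Fraction(1, 3) * a * b ** 2
                      - Fraction(1, 3) * a * (f4[k] ** 2 + f4[l] ** 2))
    return total


f_star = sqrt(3 / 7)                      # threshold fidelity of Theorem 7
print(f_out([f_star] * 5) - f_star)       # fixed point, up to rounding
f4 = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4), Fraction(1, 5)]
print(derivative_numerator(f4) == eq28(f4))
```

With equal inputs, $f = \sqrt{3/7}$ is a fixed point of $f_{\mathrm{out}}$, which is where the threshold in Theorem 7 comes from.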

The assumption of Theorem 7, that each prepared $\rho_i$ has equal $x, y, z$ coordinates, can be
guaranteed by randomly applying $I$, $T$, or $T^2$, each with probability 1/3, independently to each
prepared $\rho_i$. However, the ability to apply perfect Clifford unitaries may not imply the ability to
apply a random perfect Clifford unitary. Depending on the application, this symmetrization may
not be innocuous.

For example, one important class of schemes for achieving fault tolerance is based on postselection
[Kni05, Rei06a, AGP08, Rei09]. In such schemes one can apply stabilizer operations to
encoded qubits. These operations can be made arbitrarily accurate, but only conditioned on some
heralded random event. That is, after trying to apply an operation, the experimenter
receives a message whether or not the operation succeeded. The details are not important here.
The problem, though, is that the success probability can depend on the operation being applied.
Thus, if we try to apply $I$, $T$, or $T^2$ each with probability 1/3, there is no guarantee that the
probabilities conditioned on success will also be each 1/3.
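
A toy calculation shows the bias. The success probabilities below are made up purely for illustration and do not come from any of the cited schemes; the point is only that Bayes' rule skews a uniform choice when the heralded success rates differ.

```python
from fractions import Fraction


def conditioned_on_success(prior, success_prob):
    """Bayes' rule: distribution over the applied operation, given heralded success."""
    joint = {op: prior[op] * success_prob[op] for op in prior}
    total = sum(joint.values())
    return {op: weight / total for op, weight in joint.items()}


prior = {"I": Fraction(1, 3), "T": Fraction(1, 3), "T^2": Fraction(1, 3)}
# Hypothetical heralded success probabilities, chosen only for illustration.
success = {"I": Fraction(9, 10), "T": Fraction(3, 5), "T^2": Fraction(3, 5)}

posterior = conditioned_on_success(prior, success)
print(posterior)  # no longer uniform: I is over-represented
```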

Now the fact that there is a working scheme for distilling $\rho_1 \otimes \cdots \otimes \rho_n$ that starts by applying $I$, $T$
or $T^2$ at random to each qubit implies that there exists some fixed sequence of unitaries $U_1 \otimes \cdots \otimes U_n$
that works at least as well. If we knew this sequence, we would have no need to randomize.
Unfortunately, this sequence a priori could depend on the states $\rho_i$ in an arbitrary manner. Without
knowing the states exactly, therefore, we cannot derandomize the distillation scheme. In fact,
though, Ref. [Rei09] shows that the $|T\rangle$-state distillation scheme can be derandomized, albeit at a
cost:

Theorem 8 ([Rei09, Theorem 1]). There exists a constant $\epsilon > 0$ such that perfect stabilizer operations
with adaptive classical control, together with the ability to prepare (unknown) states $\rho_i$ each
with fidelity at least $1 - \epsilon$ with $|T\rangle$, gives quantum universality.

More explicitly, writing $\rho_i = \rho(x_i, y_i, z_i)$, a sufficient condition for quantum universality is that for
each $i$,

$$\max\big\{ |\tfrac1{\sqrt3} - x_i|,\ |\tfrac1{\sqrt3} - y_i|,\ |\tfrac1{\sqrt3} - z_i| \big\} \leq 0.0527 . \qquad (29)$$

In fact, by considering a decoding circuit for the five-qubit code, Ref. [Rei09] shows that not all
stabilizer operations are necessary for simulating universal quantum computation. In Theorem 8,
it suffices to have the operations CNOT, Hadamard, preparation of $|0\rangle$ and measurement in the
$|0\rangle, |1\rangle$ basis.

In the next section, we will summarize applications of magic states distillation to quantum fault- 
tolerance schemes. For lower bounds on the fault-tolerance threshold, statements like Theorem 7 
or Theorem 8 are typically needed. For placing limits on upper bounds, though, i.e., on attempts 
to prove the impossibility of reliable quantum computation at high noise rates, theorems such as
Theorem 1, Theorem 3 and Theorem 4, which assume the state $\rho$ is known exactly and can be
prepared repeatedly, are still of use.

Figure 4: Knill's method for achieving universality by teleporting into an encoding. If the X and
Z measurements are not postselected, then a logical correction (not shown) might be required.

7 Fault-tolerance applications 

The main application of magic states distillation is to fault-tolerant quantum computing, or reliable 
computing in the presence of noisy gates. In particular, it clearly addresses the problem of achieving 
universality using noisy ancilla preparation. It does assume perfect stabilizer operations, though, 
which is certainly not realistic. This assumption can be justified in two different contexts: 

1. Certain arguments for upper-bounding the tolerable noise rate (or "threshold") assume that 
stabilizer operations are perfect and only the extra operation required for universality is noisy. 
This is optimistic, but sufficient for an upper bound. Section 7.2 below considers the relationship 
of magic states distillation to these arguments. 

2. Assumptions, like perfect stabilizer operations, which are unrealistic at the physical level can 
sometimes be justified at higher levels of encoding in a fault-tolerant concatenated coding scheme. 
In particular, it is possible that there are two different noise thresholds, one threshold for reliable 
stabilizer operations and a separate threshold for reliable universal quantum computation. If the 
physical noise is below the threshold for reliable stabilizer operations, then we can assume that 
stabilizer operations are in fact perfect — not at the physical level, but at some level of encoding. 

7.1 Fault-tolerance threshold lower bounds and estimates 

Assuming perfect stabilizer operations, magic states distillation gives universality by using noisy, 
single-qubit ancilla states. However, these noisy ancillas need to be prepared not at the physical 
level, but at the same higher level of encoding at which the logical stabilizer operations are reliable. 

How can one reliably encode noisy ancillas? Following Knill [Kni04, Kni05], we use perfect
encoded stabilizer operations to create an encoded Bell pair $\frac1{\sqrt2}(|00\rangle_L + |11\rangle_L)$ (the subscript $L$
denoting "logical"). Then we decode one half from the bottom up, ideally obtaining
$\frac1{\sqrt2}(|0\rangle|0\rangle_L + |1\rangle|1\rangle_L)$. Then prepare a qubit in a "magic" state like $|H\rangle$ or $|T\rangle$ and teleport it into the encoding,
using a physical CNOT gate and two single-qubit measurements. See Figure 4. If there is no noise,
then the output state will be $|H\rangle_L$ or $|T\rangle_L$. At that point, both stabilizer operations and ancilla
preparation can be done at an encoded level, so encoded universality follows.

In the presence of noise, the noise too will be teleported into the encoding, i.e., into logical
noise. As long as it is not too high, it can be distilled away at the encoded level, using (perfect)
encoded stabilizer operations. This noise can come from three places: noise in the prepared single-qubit
ancilla, noise in the physical teleportation circuit, and noise in the decoded half of the Bell
pair. As long as the total noise from these sources is not too large, magic states distillation will
succeed. Fortunately, magic states distillation can tolerate very high amounts of noise: for example,
$\frac12(1 - \sqrt{3/7}) \approx 17.3\%$ depolarizing noise on $|T\rangle$, by Theorem 1. Therefore, it often turns out that
the threshold for universal quantum computation is the same as that for just stabilizer operations.
The bottleneck is in achieving reliable stabilizer operations.

This is a useful technique because it is easier to prove a noise threshold for stabilizer operations 
alone than for a full universal scheme. For fault-tolerance schemes using quantum stabilizer codes, 
only physical stabilizer operations are required to achieve reliable encoded stabilizer operations. 
Stabilizer operations are easy to work with because Pauli errors propagate through them linearly. 
Moreover, these operations can be simulated efficiently, by the Gottesman-Knill theorem — meaning 
that we can run classical simulations to estimate the fault-tolerance threshold for stabilizer oper- 
ations. Steane and Knill, among many others, have run extensive simulations of this type to 
determine threshold estimates [Ste03, Kni05]. 

Threshold proofs and estimates simplify using this reduction because magic states distillation 
lets us skip the fault-tolerance hierarchy for universal quantum computing operations. That is, 
with a concatenated-coding fault-tolerance scheme, stabilizer operations at k levels of encoding are 
implemented in terms of stabilizer operations at k — 1 levels of encoding. Typically, the additional 
operation needed to obtain universality is also implemented at level k in terms of the same operation 
and stabilizer operations at level k — 1. Using magic states distillation, though, we can instead 
obtain universality at level k using only level-k stabilizer operations and some sort of universality 
operation at level 0, in this case, certain noisy ancilla preparations. 

This reduction, from universal fault tolerance to stabilizer operation fault tolerance, does not
necessarily always work, because it requires that we be able to decode one half of an encoded Bell
pair without introducing too much noise. It is possible that we can prepare a perfect encoded Bell
pair but cannot decode half of it without losing control of the noise. In the schemes that have
been studied [Kni05, Rei06b], however, this has not seemed to happen. The decoding operation
was straightforward to analyze rigorously in Ref. [Rei06b] because decoding blocks independently
cannot create correlated errors, which are the main obstacle to proving a threshold for stabilizer
operations.

There are different tricks that might be useful, too, in decoding a state; for example, instead of 
correcting any detected errors, one might postselect on no detected errors. If any errors are detected, 
then throw the Bell pair away and start over. This can adversely impact the overhead if applied 
injudiciously, but it might be a reasonable technique to apply in a limited fashion. For example, in 
a recursive decoding scheme, one might only postselect on no detected errors in decoding the last 
few levels. 

Some technical concerns arise in applying magic states distillation to fault tolerance. For example,
besides knowing $\mathcal{U}(\rho)$, we are also interested in the stability of the procedure, either because
we cannot repeatedly prepare exact copies of $\rho$ or perhaps because our knowledge of $\rho$ has limited
accuracy [Rei06b]. These concerns are addressed by Theorem 7 and Theorem 8.

Finally, note that while fault-tolerance schemes often use concatenated coding, and magic states 
distillation can also be phrased as projection into a certain code space, the two codes need bear no 
relationship to each other. 

7.2 Fault-tolerance threshold upper bounds 

Giving upper bounds for the fault-tolerance threshold, with a given set of operations and a given
noise model, is difficult. There have been only a few approaches, and these tend to be tied delicately
to a particular model. For example, Aharonov et al. show that a useful noisy quantum circuit
can only have logarithmic depth if fresh ancillas are not allowed to be introduced during the
computation [ABO96, ABOIN96]. In practical quantum computing schemes, though, it is possible
to initialize ancillas during the computation. Razborov shows that the tolerable noise rate, of
circuits with more than logarithmic depth, can be at most 1/2 for a gate set with gates of fan-in
two [Raz04]. This approach does not allow for noiseless classical control based on measurement
results, though, which is common in proposed experimental implementations of quantum computers.
Moreover, interesting problems, including factoring, can be solved with logarithmic-depth quantum
circuits, aided by classical computation [CW00].

Harrow and Nielsen [HN03] ask how much depolarizing noise can be tolerated by a two-qubit 
gate before it loses its power to generate entanglement; they find that the CNOT is the most resilient 
two-qubit gate, but does not tolerate independent depolarizing noise higher than 74%. (Virmani, 
Huelga and Plenio improve this to 2/3 with a more careful entanglement requirement [VHP05].) 
Against simultaneous depolarizing noise, they find that the threshold is at most 8/9, or 1/2 for a 
somewhat-adversarial noise model, an optimal noise process including correlated two-qubit noise. 

Virmani et al. [VHP05] assume that stabilizer operations, including the CNOT gate, are perfect,
and ask how much noise can be tolerated in an additional gate used to achieve universality. They
show that the $\pi/8$ gate, $\exp(i\frac{\pi}{8}Z)$, with $(\sqrt2 - 1)/(2\sqrt2) \approx 14.6\%$ or more worst-case noise, or twice
that amount of dephasing noise, becomes a convex combination of stabilizer operations and so this
gate set can be simulated classically. Among all rotations about the $Z$ axis, the $\pi/8$ gate is the most
resistant to dephasing noise according to their criterion. The advantage of this approach, and also
that of Harrow and Nielsen, is that it easily allows for the incorporation of noiseless classical control
into the model.

Buhrman et al. extend these results to a depolarizing noise channel [BCL+06]. Again, assume
that stabilizer operations are perfect, and assume that a noisy single-qubit gate is used to achieve
universality. They show that the $\pi/8$ gate with $(6 - 2\sqrt2)/7 \approx 45.3\%$ or more depolarizing noise
becomes a convex combination of stabilizer operations. And again, the $\pi/8$ gate is the most
noise-resistant single-qubit gate. Therefore, 45.3% is an upper bound on the noise threshold in this
model.

Magic states distillation (Theorem 6) shows the limit of the techniques of [VHP05, BCL+06].
Both their upper bounds are tight; with any less noise one gets universal quantum computation.
Since from Section 7.1 we typically expect the threshold bottleneck to be in achieving perfect
stabilizer operations, this may not be very surprising.

Proof of Theorem 6. The upper bounds are due to [VHP05, BCL+06].

The $\pi/8$ gate with less than $(\sqrt2 - 1)/(2\sqrt2)$ worst-case probabilistic noise, or twice that amount
of dephasing noise, takes $|+\rangle$ to a state $\rho(x, x, 0)$ with $x > 1/2$, implying universality together with
perfect stabilizer operations by Theorem 2.

The $\pi/8$ gate with 45% depolarizing noise, however, takes $|+\rangle$ to a state well inside the octahedron
$O_1$ of Figure 1. Instead, inspired by the Jamiolkowski isomorphism, apply the noisy gate to
the second half of a Bell pair, which is a stabilizer state. If the depolarizing noise rate is less than
$(6 - 2\sqrt2)/7$, then the output two-qubit state lies outside of $O_2$ (defined in Section 5.2). Moreover,
there does exist a two-to-one-qubit stabilizer reduction giving a state outside $O_1$: simply apply the
parity-check procedure of Section 2. Indeed, the renormalized output state at a depolarizing noise
rate of $(6 - 2\sqrt2)/7 - \epsilon$ is computed to have $x, y, z$ coordinates proportional to

$$\big((1 + 2\sqrt2)(2 + 7\epsilon),\ -2\sqrt2\,(1 + 2\sqrt2 + 7\epsilon),\ 0\big) , \qquad (30)$$

for which $|x| + |y| > 1$ when $\epsilon > 0$. By Theorem 2, this state gives universality. □
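
The threshold $(6 - 2\sqrt2)/7$ can be cross-checked with the comparison of Eq. (20). The sketch below is not the calculation behind Eq. (30). It assumes the Bell pair $\frac14(II + XX - YY + ZZ)$, a gate that rotates $X \to (X - Y)/\sqrt2$ and $Y \to (X + Y)/\sqrt2$, depolarizing noise of rate $p$ shrinking every nontrivial coordinate by $1 - p$, and a single postselected $ZZ = +1$ measurement. Under those assumptions the reduced state leaves the octahedron exactly when $p < (6 - 2\sqrt2)/7$.

```python
from math import sqrt


def noisy_pi8_bell_coords(p):
    """Pauli coordinates c_P (rho = sum_P c_P P) of the Bell pair after a pi/8 gate with
    depolarizing rate p on the second qubit (sign conventions are an assumption)."""
    c = (1 - p) / 4
    r = c / sqrt(2)
    return {"II": 0.25, "ZZ": c, "XX": r, "XY": -r, "YX": -r, "YY": -r}


def parity_check_margin(coords):
    """Eq. (20)-style comparison for postselecting ZZ = +1.
    A positive value means the reduced qubit lies outside the octahedron."""
    g = lambda key: coords.get(key, 0.0)
    identity_part = g("II") + g("ZZ")
    l1_norm = abs(g("IZ") + g("ZI")) + abs(g("XX") - g("YY")) + abs(g("XY") + g("YX"))
    return l1_norm - identity_part


threshold = (6 - 2 * sqrt(2)) / 7
for p in (threshold - 0.01, threshold, threshold + 0.01):
    print(round(p, 4), parity_check_margin(noisy_pi8_bell_coords(p)))
```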
There are a number of questions still to answer about these threshold upper bound results. We
will just list a few of them:

1. Assume $\mathcal{E}$ is a partly depolarized single-qubit unitary $\mathcal{E}(\rho) = (1 - p)U\rho U^\dagger + p\,\frac12 I\,\mathrm{Tr}\,\rho$ (or,
more generally, a mixture of unitaries $\mathcal{E}(\rho) = \sum_i p_i U_i \rho U_i^\dagger$), but is not a mixture of Cliffords.
Does $\mathcal{E}$ together with stabilizer operations suffice for achieving universality? Certainly simply
applying $\mathcal{E}$ to a single-qubit Pauli eigenstate may not suffice, for it could create, e.g.,
$\rho(x, y, z)$ with $x = y = z \in (\frac13, \frac1{\sqrt7}]$, which we do not know how to distill. None of the
counterexamples from Section 5 can be written as a single-qubit unitary on a stabilizer state
followed by depolarizing noise.$^1$ Note that the last six elements in Figure 3 correspond to the
24 single-qubit Clifford unitaries under the Jamiolkowski isomorphism, up to sign/syndrome
choices.

For a noisy gate $\mathcal{E}$, stabilizer operations with adaptive classical control and $\mathcal{E}$ gives universality
if and only if $\mathcal{U}((\mathbb{1} \otimes \mathcal{E})(|\Psi\rangle\langle\Psi|))$ for $|\Psi\rangle = \frac1{\sqrt2}(|00\rangle + |11\rangle)$. Indeed, the "if" direction is by
definition of $\mathcal{U}(\rho)$. Only if: $\mathcal{E}$ being universal with stabilizer operations implies that we can
efficiently approximate $|T\rangle$ to arbitrary accuracy, i.e., using $\mathrm{poly}(\log(1/\delta))$ gates to obtain
precision $\delta$. In particular, we can get within a constant of $|T\rangle$ using only a constant number of
applications of $\mathcal{E}$. Now assume we have only stabilizer operations and repeated preparation
of $\rho = (\mathbb{1} \otimes \mathcal{E})(|\Psi\rangle\langle\Psi|)$. In our approximate preparation of $|T\rangle$, replace every application
of $\mathcal{E}$ by teleportation into $\rho$, conditioning each measurement outcome so no correction is
required. Since the success probability of each teleportation is 1/4 and only a constant
number of teleportations are required, the expected overhead is a constant. Once we have an
approximation of $|T\rangle$, we obtain universality by distilling it, e.g., using Theorem 7.

In fact, if $\mathcal{E}$ is the $\pi/8$ gate with depolarizing noise, this equivalence is much simpler;
teleporting into $(\mathbb{1} \otimes \mathcal{E})(|\Psi\rangle\langle\Psi|)$ can be accomplished deterministically. Since the $\pi/8$ gate is in
$C_3$, the set of operators that conjugate Paulis to Cliffords, any required correction is always
a Clifford and can be applied [GC99]. The depolarizing noise commutes past the correction.

Therefore, without loss of generality we may assume that $\mathcal{E}$ is applied to the second half
of $|\Psi\rangle$. But can one always distill the resulting state by first applying a two-to-one-qubit
stabilizer reduction?

2. If we do not assume perfect CNOT gates, then can we reduce the amount of error allowed
on the single-qubit gate, such as a $\pi/8$ gate? This is certainly sometimes the case if our
gate set consists of preparation of noisy ancillas. For example, if the CNOT noise model is bitwise
independent depolarizing channels prior to the CNOT, then we of course will not achieve
universality if the total depolarizing noise on the ancilla moves it into the Bloch sphere. This
criterion is probably not tight, however. Can we get similar results for more interesting noise
models, or for the $\pi/8$ gate? Plenio and Virmani have recently studied this problem under
the assumption that the fault-tolerance scheme uses the magic state ancillas in a certain
way [PV08]. They obtain noise threshold upper bounds, subject to this assumption, that are
remarkably close to estimated threshold lower bounds.

1 Indeed, each counterexample has a nonzero coordinate for one of IX, IY, IZ, XI, YI or ZI, and it is simple to
prove that the same is true after applying any two-qubit Clifford; the interesting case is the first example of Table 1.

8 Conclusion



Magic states distillation is important both because it describes the power of stabilizer operations,
in terms of what more is necessary for achieving full quantum universality, and because of its
application to quantum fault tolerance. We have given improved magic states distillation procedures,
reducing the set of single-qubit mixed states $\rho$ for which $\mathcal{U}(\rho)$ is unknown. We have also introduced
the multi-qubit magic states distillation problem, and proved that it does not reduce to the
single-qubit case. We have used magic states distillation to prove that two noise threshold upper bounds
are in fact tight.

There remain many open problems, including most of those described in earlier papers [BK05, 
Rei05]. Are there two-qubit nonstabilizer states that cannot be distilled to a single-qubit nonsta- 
bilizer state? We have given some candidate states, one of which might have sufficient structure 
for an analysis. In particular, it is interesting to specialize this question to those two-qubit states 
arising from the Jamiolkowski isomorphism. 

Research conducted while the author was at the University of California, Berkeley, supported
in part by NSF ITR Grant CCR-0121555, and ARO Grant DAAD 19-03-1-0082.